Equivalence, non-inferiority and superiority testing

an interactive visualization

Kristoffer's LinkedIn profile

It is not uncommon to see researchers conclude that two treatments are equally effective based on a non-significant test of the null hypothesis, or that shortening a treatment yields effects that are no worse than the standard (longer) treatment, based on p > 0.05. Both conclusions are wrong: failing to reject the null hypothesis is not evidence that the null hypothesis is true. Much has been written about this, and in medicine the appropriate tests for these kinds of hypotheses are equivalence and non-inferiority tests. When testing for equivalence, we test whether a treatment effect lies inside a prespecified equivalence margin [-Δ, Δ]. Similarly, when testing whether a treatment is no worse than another treatment, we test whether the effect is above a prespecified non-inferiority margin -Δ. My aim with this visualization is to show the decision rules associated with these different types of hypotheses. It also shows how power relates to the different tests and to different values of Δ, d and n.

Below I use a 95 % confidence interval to demonstrate the different hypotheses. You can move the CI around using the sliders or by clicking and dragging. Results of the test of treatment differences will automatically be highlighted.

Settings

[Interactive controls: sliders for the observed effect (d), the sample size per group (n), and the margin (Δ).]

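[Interactive figure: the 95 % CI for the observed effect, the power of each test, and a verdict for the new treatment, for example "Effect of new treatment is inconclusive".]

Superiority

H₀: d = 0
H_a: d > 0

Non-inferiority

H₀: d ≤ -Δ
H_a: d > -Δ

Equivalence

H₀: d ≤ -Δ or d ≥ Δ
H_a: -Δ < d < Δ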
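The decision rules for the three hypotheses can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the visualization: it assumes a standardized effect with unit variance and n observations per group (so the standard error is sqrt(2/n)), and the function names `confidence_interval` and `decide` are made up for this example.

```python
from math import sqrt
from statistics import NormalDist


def confidence_interval(d, n, level=0.95):
    """Normal-approximation CI for a standardized mean difference.

    Assumption: unit variance and n observations per group,
    so the standard error is sqrt(2 / n).
    """
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    se = sqrt(2 / n)
    return d - z * se, d + z * se


def decide(d, n, margin, level=0.95):
    """Report which alternative hypotheses the CI supports."""
    lower, upper = confidence_interval(d, n, level)
    return {
        # Superiority: the whole CI lies above 0.
        "superiority": lower > 0,
        # Non-inferiority: the lower limit lies above -margin.
        "non_inferiority": lower > -margin,
        # Equivalence: the whole CI lies inside (-margin, margin).
        "equivalence": lower > -margin and upper < margin,
    }


if __name__ == "__main__":
    print(decide(d=0.0, n=100, margin=0.3))
    print(decide(d=0.0, n=10, margin=0.3))
```

If none of the three entries is true, the result is inconclusive: the CI is too wide, or too badly placed, to support any of the alternative hypotheses.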

Technical notes

Power is calculated using the following power functions. Note that α is 0.025 for all tests, since a 95 % CI is used. Also note that normal approximations are used, so power will be slightly off for very small sample sizes.

Power of equivalence test

Power of non-inferiority test

Power of superiority test

where Φ is the cumulative distribution function of the standard normal distribution, d is Cohen's d, Δ is the non-inferiority or equivalence margin, n is the sample size per group, and z_{1-α} is the 100(1-α)th percentile of a standard normal distribution.
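For reference, the usual normal-approximation power functions for these three tests can be written as below. This is a sketch under stated assumptions: a standardized effect with unit variance, equal group sizes, and one-sided tests at level α. The exact expressions used in the visualization may be arranged differently.

```latex
\begin{aligned}
\text{Equivalence:}\quad
1-\beta &= \Phi\!\left(\sqrt{\tfrac{n}{2}}\,(\Delta - d) - z_{1-\alpha}\right)
         + \Phi\!\left(\sqrt{\tfrac{n}{2}}\,(\Delta + d) - z_{1-\alpha}\right) - 1 \\[4pt]
\text{Non-inferiority:}\quad
1-\beta &= \Phi\!\left(\sqrt{\tfrac{n}{2}}\,(d + \Delta) - z_{1-\alpha}\right) \\[4pt]
\text{Superiority:}\quad
1-\beta &= \Phi\!\left(\sqrt{\tfrac{n}{2}}\,d - z_{1-\alpha}\right)
\end{aligned}
```

Setting Δ = 0 in the non-inferiority expression recovers the superiority expression, which is a quick check that the two are consistent.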

Formulas are adapted from Julious, Steven A. "Sample sizes for clinical trials with normal data." Statistics in medicine 23.12 (2004): 1921-1986.
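The normal-approximation power calculations can also be sketched in Python using only the standard library. The functions below are illustrative, not the code used by the visualization: they assume a standardized effect with unit variance, n observations per group, and one-sided tests at level α, and the function names are chosen for this example.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def power_superiority(d, n, alpha=0.025):
    """Approximate power to show d > 0 with n observations per group."""
    z = _STD_NORMAL.inv_cdf(1 - alpha)
    return _STD_NORMAL.cdf(d * sqrt(n / 2) - z)


def power_non_inferiority(d, n, margin, alpha=0.025):
    """Approximate power to show d > -margin."""
    z = _STD_NORMAL.inv_cdf(1 - alpha)
    return _STD_NORMAL.cdf((d + margin) * sqrt(n / 2) - z)


def power_equivalence(d, n, margin, alpha=0.025):
    """Approximate power to show -margin < d < margin (two one-sided tests)."""
    z = _STD_NORMAL.inv_cdf(1 - alpha)
    upper = _STD_NORMAL.cdf((margin - d) * sqrt(n / 2) - z)
    lower = _STD_NORMAL.cdf((margin + d) * sqrt(n / 2) - z)
    # The approximation can dip below zero for small n; clip it at 0.
    return max(0.0, upper + lower - 1)


if __name__ == "__main__":
    print(round(power_superiority(d=0.5, n=64), 3))
    print(round(power_non_inferiority(d=0.0, n=64, margin=0.5), 3))
    print(round(power_equivalence(d=0.0, n=200, margin=0.3), 3))
```

With a true effect of zero, `power_superiority` returns α itself, which is the type I error rate of the one-sided test.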

Type I error

Non-inferiority is shown if the lower limit of a two-sided (1–2α)×100 % CI is above -Δ. Here that is a 95 % CI, so the significance level is α = 0.025. With the two one-sided tests (TOST) procedure, equivalence is also tested using a (1–2α)×100 % CI, so the significance level is again 0.025. In the visualization, superiority is tested as a one-tailed test as well, also at a significance level of 0.025. To use a 0.05 significance level instead, we would use 90 % CIs.
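The link between the one-sided significance level and the CI level is simple arithmetic, sketched here in Python. The helper names `ci_level_for_alpha` and `critical_value` are illustrative only.

```python
from statistics import NormalDist


def ci_level_for_alpha(alpha):
    """Two-sided CI level whose limits match one-sided tests at level alpha."""
    return 1 - 2 * alpha


def critical_value(alpha):
    """Standard normal quantile z_{1-alpha} used to build that CI."""
    return NormalDist().inv_cdf(1 - alpha)


if __name__ == "__main__":
    for alpha in (0.025, 0.05):
        level = ci_level_for_alpha(alpha)
        print(f"alpha = {alpha}: {level:.0%} CI, z = {critical_value(alpha):.3f}")
```

So a one-sided α of 0.025 pairs with a 95 % CI (z about 1.96), and a one-sided α of 0.05 pairs with a 90 % CI (z about 1.645).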

The content on this blog is shared for free under a CC-BY license.
